home *** CD-ROM | disk | FTP | other *** search
Text File | 1990-11-25 | 17.4 KB | 552 lines | [TEXT/PICN] |
- .so tmac.tr
- .DA "January 1, 1990; last revised July 4, 1990"
- .TR 90-6b
- .Gl
- .TL
- An Overview of the Icon Programming Language; Version 8
- .AU
- Ralph E. Griswold
- .AE
- .tr *\(**
- .NH
- Introduction
- .PP
- Icon is a high-level programming language with extensive facilities for
- processing strings and structures. Icon has several novel features, including
- expressions that may produce sequences of results, goal-directed
- evaluation that automatically searches for a successful result, and
- string scanning that allows operations on strings to be formulated at
- a high conceptual level.
- .PP
- Icon emphasizes high-level string
- processing and a design philosophy that allows ease of programming
- and short, concise programs. Storage allocation and
- garbage collection are automatic in Icon, and there are few restrictions on the
- sizes of objects. Strings, lists, and other structures are created
- during program execution and their size does not need to be known when
- a program is written.
- Values are converted to expected types automatically; for example,
- numeral strings read in as input can be used in numerical computations
- without explicit conversion.
- Icon has an expression-based syntax with reserved words;
- in appearance, Icon programs resemble those of Pascal and C.
- .PP
- Although Icon has extensive facilities for processing strings and structures,
- it also has a full repertoire of computational facilities. It is suitable
- for a wide variety of applications.
- Some examples are:
- .in .5i
- .IP \(bu
- text analysis, editing, and formatting
- .IP \(bu
- document formating
- .IP \(bu
- artificial intelligence
- .IP \(bu
- expert systems
- .IP \(bu
- rapid prototyping
- .IP \(bu
- symbolic mathematics
- .IP \(bu
- text generation
- .IP \(bu
- data laundry
- .in 0
- .PP
- There are public-domain implementations of Icon for the Amiga, the Atari ST,
- the Macintosh under MPW, MS-DOS, MVS, OS/2, many UNIX systems, VM/CMS, and
- VMS. There also is a commercial implementation of an enhanced version of
- Icon for the Macintosh
- [.proicon.].
- .PP
- The remainder of this report briefly describes the highlights of Icon.
- For a complete description, see [.ilbook, .tr90-1.].
- .NH
- Expression Evaluation
- .NH 2
- Conditional Expressions
- .PP
- In Icon there are \fIconditional expressions\fR that may \fIsucceed\fR and
- produce a result, or may \fIfail\fR and not produce any result. An example
- is the comparison operation
- .Ds
- i > j
- .De
- which succeeds (and produces the value of \*Mj\fR) provided that the value
- of \*Mi\fR is greater than the value of \*Mj\fR, but fails otherwise.
- Similarly,
- .Ds
- i > j > k
- .De
- succeeds if the value of \*Mj\fR is between \*Mi\fR and \*Mk\fR.
- .PP
- The success or failure of conditional operations is used instead of
- Boolean values to drive control structures in Icon. An example is
- .Ds
- if i > j then k := i else k := j
- .De
- which assigns the value of \*Mi\fR to \*Mk\fR if the value of \*Mi\fR
- is greater than the value of \*Mj\fR, but assigns the value of \*Mj\fR to
- \*Mk\fR \%otherwise.
- .PP
- The usefulness of the concepts of success and failure is illustrated by
- \*Mfind(s1,s2)\fR, which fails
- if \*Ms1\fR does not occur as a substring of \*Ms2\fR.
- Thus
- .Ds
- if i := find("or",line) then write(i)
- .De
- writes the position at which \*M"or"\fR occurs in \*Mline\fR, if it occurs,
- but does not write a value if it does not occur.
- .PP
- Many expressions in Icon are conditional. An example is \*Mread()\fR,
- which produces the next line from the input file, but fails when the
- end of the file is reached. The following expression is typical of
- programming in Icon and illustrates the integration of conditional
- expressions and conventional control structures:
- .Ds
- while line := read() do
- write(line)
- .De
- This expression copies the input file to the output file.
- .PP
- If an argument of a function fails, the function is not called,
- and the function call fails as well. This ``inheritance'' of failure allows the
- concise formulation of many programming tasks. Omitting the optional
- \*Mdo\fR clause in \*Mwhile-do\fR, the previous expression can be
- rewritten as
- .Ds
- while write(read())
- .De
- .NH 2
- Generators
- .PP
- In some situations, an expression may be capable of producing more than
- one result. Consider
- .Ds
- sentence := "Store it in the neighboring harbor"
- find("or",sentence)
- .De
- Here \*M"or"\fR occurs in \*Msentence\fR at positions 3, 23, and 33. Most
- programming languages treat this situation by selecting one of the
- positions, such as the first, as the result of the expression. In Icon,
- such an expression is a \fIgenerator\fR and is capable of producing
- all three positions.
- .PP
- The results that a generator produces depend on context. In a situation
- where only one result is needed, the first is produced, as in
- .Ds
- i := find("or",sentence)
- .De
- which assigns the value 3 to \*Mi\fR.
- .PP
- If the result produced by a generator does not lead to the success of
- an enclosing expression, however, the generator is \fIresumed\fR
- to produce another value. An example is
- .Ds
- if (i := find("or",sentence)) > 5 then write(i)
- .De
- Here the first result produced by the generator, 3, is assigned to
- \*Mi\fR, but this value is not greater than 5 and the comparison
- operation fails. At this point, the generator is resumed and
- produces the second position, 23, which is greater than 5. The
- comparison operation then succeeds and the value 23 is written.
- Because of the inheritance of failure and the fact that comparison
- operations return the value of their right argument, this expression
- can be written in the following more compact form:
- .Ds
- write(5 < find("or",sentence))
- .De
- .PP
- Goal-directed evaluation is inherent in the expression evaluation
- mechanism of Icon and can be used in arbitrarily complicated situations.
- For example,
- .Ds
- find("or",sentence1) = find("and",sentence2)
- .De
- succeeds if \*M"or"\fR occurs in \*Msentence1\fR at the same position
- as \*Mand\fR occurs in \*Msentence2\fR.
- .PP
- A generator can be resumed repeatedly to produce all its results by
- using the \*Mevery-do\fR control structure. An example is
- .Ds
- every i := find("or",sentence)
- do write(i)
- .De
- which writes all the positions at which \*M"or"\fR occurs in \*Msentence\fR.
- For the example above, these are 3, 23, and 33.
- .PP
- Generation is inherited like failure, and this expression can be written
- more concisely by omitting the optional \*Mdo\fR clause:
- .Ds
- every write(find("or",sentence))
- .De
- .PP
- There are several built-in generators in Icon. One of the most frequently
- used of these is
- .Ds
- i to j
- .De
- which generates the integers from \*Mi\fR to \*Mj\fR. This generator can be
- combined with \*Mevery-do\fR to formulate the traditional \*Mfor\fR-style
- control structure:
- .Ds
- every k := i to j do
- f(k)
- .De
- Note that this expression can be written more compactly as
- .Ds
- every f(i to j)
- .De
- .PP
- There are a number of other control structures related to generation.
- One is \fIalternation\fR,
- .Ds
- \*1 | \*2
- .De
- which generates the results of \*1 followed by the results of \*2.
- Thus
- .Ds
- every write(find("or",sentence1) | find("or",sentence2))
- .De
- writes the positions of \*M"or"\fR in \*Msentence1\fR followed by
- the positions of \*M"or"\fR in \*Msentence2\fR. Again, this sentence can
- be written more compactly by using alternation in the second
- argument of \*Mfind()\fR:
- .Ds
- every write(find("or",sentence1 | sentence2))
- .De
- .PP
- Another use of alternation is illustrated by
- .Ds
- (i | j | k) = (0 | 1)
- .De
- which succeeds if any of \*Mi\fR, \*Mj\fR, or \*Mk\fR has the value 0 or 1.
- .PP
- Procedures can be used to add generators to Icon's built-in repertoire. For example,
- .Ds
- procedure findodd(s1,s2)
- every i := find(s1,s2) do
- if i % 2 = 1 then suspend i
- end
- .De
- is a procedure that generates the odd-valued positions at which \*Ms1\fR occurs
- in \*Ms2\fR. The \*Msuspend\fR control structure returns a value from the
- procedure, but leaves it in suspension so that it can be resumed for another
- value. When the loop terminates, control flows off the end of the
- procedure without producing another value.
- .NH
- String Scanning
- .PP
- For complicated operations, the bookkeeping involved in keeping track of
- positions in strings becomes burdensome and error prone.
- Icon has a string scanning facility that is
- manages positions automatically.
- Attention is
- focused on a current position in a string as it is examined by a sequence of
- operations.
- .PP
- The string scanning operation has the form
- .Ds
- s ? \*0
- .De
- where \*Ms\fR is the \fIsubject\fR string to be examined and \*0 is an expression that
- performs the examination.
- A position in the subject, which starts at 1, is the focus of examination.
- .PP
- \fIMatching functions\fR change this position.
- One matching function, \*Mmove(i)\fR, moves the position by \*Mi\fR and
- produces the substring of the subject between the previous and new
- positions. If the position cannot be moved by the specified amount
- (because the subject is not long enough), \*Mmove(i)\fR fails. A
- simple example is
- .Ds
- line ? while write(move(2))
- .De
- which writes successive two-character substrings of \*Mline\fR, stopping
- when there are no more characters.
- .PP
- Another matching function is \*Mtab(i)\fR, which sets the position in the
- subject to \*Mi\fR and also returns the substring of the subject between
- the previous and new positions.
- For example,
- .Ds
- line ? if tab(10) then write(tab(0))
- .De
- first sets the position in the subject to 10 and then to the end of the subject, writing
- \*Mline\^[10:0]\fR.
- Note that no value is written if the subject is not long enough.
- .PP
- String analysis functions such as \*Mfind()\fR
- can be used in string scanning. In this context, the string that they
- operate on is not specified and is taken to be the subject. For example,
- .Ds
- line ? while write(tab(find("or")))
- do move(2)
- .De
- writes all the substrings of \*Mline\fR prior to occurrences of \*M"or"\fR.
- Note that \*Mfind()\fR produces a position, which is then used by \*Mtab\fR
- to change the position and produce the desired substring. The \*Mmove(2)\fR
- skips the \*M"or"\fR that is found.
- .PP
- Another example of the use of string analysis functions in scanning is
- .Ds
- line ? while tab(upto(&letters)) do
- write(tab(many(&letters)))
- .De
- which writes all the words in \*Mline\fR.
- .PP
- As illustrated in the examples above, any expression may occur in
- the scanning expression.
- .NH
- Structures
- .PP
- Icon supports several kinds of structures with different organizations
- and access methods. Lists are linear structures that can be accessed
- both by position and by stack and queue functions. Sets are collections
- of arbitrary values with no implied ordering. Tables provide an
- associative lookup mechanism.
- .NH 2
- Lists
- .PP
- While strings are sequences of characters, lists in Icon are sequences
- of values of arbitrary types. Lists are created by enclosing the lists
- of values in brackets. An example is
- .Ds
- car1 := ["buick","skylark",1978,2450]
- .De
- in which the list \*Mcar1\fR has four values, two of which are strings
- and two of which are integers. Note that the values in a list need not
- all be of the same type. In fact, any kind of value can occur in a list
- \(em even another list, as in
- .Ds
- inventory := [car1,car2,car3,car4]
- .De
- .PP
- Lists also can be created by
- .Ds
- L := list(i,x)
- .De
- which creates a list of \*Mi\fR values, each of which has the value
- \*Mx\fR.
- .PP
- The values in a list can be referenced by position much like the
- characters in a string. Thus
- .Ds
- car1\^[4] := 2400
- .De
- changes the last value in \*Mcar1\fR to 2400.
- A reference that is out of the range of the list fails. For example,
- .Ds
- write(car1\^[5])
- .De
- fails.
- .PP
- The values in a list \*ML\fR are generated by \*M!L\fR. Thus
- .Ds
- every write(!L)
- .De
- writes all the values in \*ML\fR.
- .PP
- Lists can be manipulated like stacks and queues. The function
- \*Mpush(L,x)\fR
- adds the value of \*Mx\fR to the left end of the list \*ML\fR,
- automatically increasing the size of \*ML\fR by one. Similarly,
- \*Mpop(L)\fR removes the leftmost value from \*ML\fR, automatically
- decreasing the size of \*ML\fR by one, and produces the removed value.
- .NH 2
- Sets
- .PP
- A sets is a collection of values.
- An empty set is created by \*Mset()\fR.
- Alternatively, \*Mset(L)\fR produces a set with the values in the list \*ML\fR.
- For example,
- .Ds
- S := set(\^[1,"abc",[\^]\^])
- .De
- assigns to \*MS\fR a set that contains the integer 1, the string \*M"abc"\fR,
- and an empty list.
- .PP
- The set operations of union, intersection, and difference are provided.
- The function \*Mmember(S,x)\fR succeeds if \*Mx\fR is a member of the
- set \*MS\fR but fails otherwise. The function \*Minsert(S,x)\fR
- adds \*Mx\fR to the set \*MS\fR,
- while \*Mdelete(S,x)\fR removes \*Mx\fR from \*MS\fR. A value only can occur once in
- a set, so \*Minsert(S,x)\fR has no effect if \*Mx\fR is already in
- \*MS\fR.
- \*M!S\fR generates the members of \*MS\fR.
- .PP
- A simple example of the use of sets is given by the following
- segment of code, which lists all the different words that
- appear in the input file:
- .Ds
- words := set()
- while line := read() do
- line ? while tab(upto(&letters)) do
- insert(words,tab(many(&letters)))
- every write(!words)
- .De
- .NH 2
- Tables
- .PP
- Tables are sets of pairs each of which consists of a key and a corresponding
- value. The key and its corresponding value may be of any type,
- and the value for any key can be looked up automatically.
- Thus, tables provide a form of associative access in contrast with the
- positional access to values in lists.
- .PP
- A table is created by an expression such as
- .Ds
- symbols := table(0)
- .De
- which assigns to \*Msymbols\fR a table with the default value 0.
- The default value is used for keys that are not assigned another value.
- Subsequently, \*Msymbols\fR can be referenced by any key, such as
- .Ds
- symbols\^["there"] := 1
- .De
- which assigns the value 1 to the key \*M"there"\fR in \*Msymbols\fR.
- .PP
- Tables grow automatically as new keys are added.
- For example, the following program segment produces a
- table containing a
- count of the
- words that appear in the input file:
- .Ds
- words := table(0)
- while line := read() do
- line ? while tab(upto(&letters)) do
- words\^[tab(many(&letters))] +:= 1
- .De
- Here the default value for each word is 0, as given
- in \*Mtable(0)\fR, and \*M+:=\fR is an augmented assignment operation that
- increments the values by one.
- There are augmented assignment operations for all binary operators.
- .PP
- A list can be obtained from a table \*MT\fR by the function \*Msort(T,1)\fR.
- The form of the list depends on the value of \*Mi\fR. For example, if
- \*Mi\fR is 3, the list contains alternate
- keys and their corresponding values in \*MT\fR.
- For example,
- .Ds
- wordlist := sort(words,3)
- while write(pop(wordlist)," : ",pop(wordlist))
- .De
- writes the words and their counts from \*Mwords\fR.
- .NH
- An Example
- .PP
- The following program, which produces a concordance of the words from an
- input file, illustrates typical Icon programming techniques. Although not all
- the features in this program are described in previous sections, the
- general idea should be clear.
- .de Ta
- .ta 3.i
- ..
- .Ds
- global uses, lineno, width
-
- procedure main(args)
- width := 15 # width of word field
- uses := table()
- lineno := 0
- every tabulate(words()) # tabulate all the citations
- output() # print the citations
- end
- .Dd
- # Add line number to citations for word
- #
- procedure tabulate(word)
- /uses\^[word] := table()
- uses\^[word]\^[lineno] := 1
- return
- end
- .Dd
- # Generate words
- #
- procedure words()
- while line := read() do {
- lineno +:= 1
- write(right(lineno,6)," ",\*bline)
- map(line) ? while tab(upto(&letters)) do {
- s := tab(many(&letters))
- if *s < 3 then next # skip short words
- suspend s
- }
- }
- end
- .Dd
- # Print the results
- #
- procedure output()
- write() # blank line
- uses := sort(uses,3) # sort citations
- while word := get(uses) do {
- line := ""
- numbers := sort(get(uses),3)
- while line ||:= get(numbers) || ", " do
- get(numbers) # skip marking value
- write(left(word,width),\*bline\^[1:-2])
- }
- end
- .De
- The program reads a line, writes it out with an identifying line number,
- and processes every word in the line. Words less than three characters
- long are considered to be ``noise'' and are discarded. A table,
- \*Muses\fR, is keyed by the words. Every key has a corresponding set
- of line numbers. The first time a word is encountered, a new set is created
- for it. The line number is inserted in any event. Since a value can be in a set
- only once, duplicate line numbers are suppressed automatically.
- .PP
- After all the input has been read, the table of words is sorted by key.
- Each corresponding set of line numbers is sorted and the line numbers
- are appended to the line to be written.
- .PP
- For example, if the input file is
- .Ds
- On the Future!-how it tells
- Of the rapture that impells
- To the swinging and the ringing
- Of the bells, bells, bells-
- Of the bells, bells, bells, bells,
- Bells, bells, bells-
- To the rhyming and the chiming of the bells!
- .De
- the output is
- .Ds
- .ta .7i
- 1 On the Future!-how it tells
- 2 Of the rapture that impells
- 3 To the swinging and the ringing
- 4 Of the bells, bells, bells-
- 5 Of the bells, bells, bells, bells,
- 6 Bells, bells, bells-
- 7 To the rhyming and the chiming of the bells!
-
- .ta 1i
- and 3, 7
- bells 4, 5, 6, 7
- chiming 7
- future 1
- how 1
- impells 2
- rapture 2
- rhyming 7
- ringing 3
- swinging 3
- tells 1
- that 2
- the 1, 2, 3, 4, 5, 7
- .De
- .SH
- Acknowledgement
- .PP
- Icon was designed by the the author in collaboration with Dave Hanson,
- Tim Korb, Cary Coutant, and Steve Wampler. Many other persons have
- contributed to its development. The current implementation is
- based on the work of Cary Coutant and Steve Wampler with recent
- contributions by Bill Mitchell, Janalee O'Bagy, Gregg Townsend, and
- Ken Walker.
- .[]
-